Abstract: A successful speech recognition system can help 
in many applications and environments in daily life. It can 
assist in non-critical operations such as presenting the driving 
route to the driver, dialing a phone number, or switching a light 
or a coffee machine on and off, apart from speaker verification 
by caste, community and locality, including identification of 
sex. In this paper, an attempt has been made to find the 
fundamental frequency of Bodo vowel phonemes and to observe 
their spectrogram and bifurcation through the utterances of 
native Bodo speakers. These studies will help to extract the 
features needed for speech/speaker recognition of the Bodo 
language. Bodo is a local language of North-East India. 


Index Terms — Bodo Language, Bifurcation, Fundamental 
Frequency, Pitch, Speech Recognition, Spectrogram 

I. INTRODUCTION 

The opening and closing of the vocal folds during speaking 
break the air stream into chains of pulses. The rate of 
repetition of these pulses is the pitch, and it defines the 
fundamental frequency of the speech signal [1]. In other 
words, the rate of vibration of the vocal folds is the 
fundamental frequency of the voice. The frequency increases 
when the vocal folds are made taut. Relative differences in the 
fundamental frequency of the voice are utilized in all 
languages to convey various aspects of linguistic 
information [2]. 

The general problem of fundamental frequency 
estimation is to take a portion of a signal and find the 
dominant frequency of repetition. The difficulties that 
arise in the estimation of fundamental frequency are: (i) not 
all signals are periodic, (ii) those that are periodic may be 
changing in fundamental frequency over the time of interest, 
(iii) signals may be contaminated with noise, or even with 
periodic signals of other fundamental frequencies, (iv) signals 
that are periodic with interval T are also periodic with 
intervals 2T, 3T, etc., so we need to find the smallest periodic 
interval, i.e., the highest fundamental frequency, and (v) even 
signals of constant fundamental frequency may be changing in 
other ways over the interval of interest. 
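
Difficulty (iv) can be illustrated with a small sketch. The toy signal below is hypothetical (one 4-sample pattern repeated six times) and the helper names are chosen only for this example. It shows that every multiple of the true period also passes a periodicity check, so an estimator has to select the smallest repeating interval.

```python
def is_periodic(x, lag):
    """Return True if x[m] == x[m + lag] for every index where both exist."""
    return all(x[m] == x[m + lag] for m in range(len(x) - lag))


def smallest_period(x, max_lag):
    """Return the smallest lag in 1..max_lag at which x repeats, or None."""
    for lag in range(1, max_lag + 1):
        if is_periodic(x, lag):
            return lag
    return None


# Hypothetical toy signal: one period of 4 samples repeated 6 times.
period = [0, 3, 1, -2]
signal = period * 6

# Lags 4, 8 and 12 all satisfy the periodicity check (T, 2T, 3T).
repeating_lags = [lag for lag in range(1, 13) if is_periodic(signal, lag)]

# Only the smallest of them corresponds to the fundamental period.
fundamental_period = smallest_period(signal, 12)
```

For real speech the equality test would have to be replaced by a similarity measure, since the signal is never exactly periodic.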

In general, the fundamental frequency of the speech 
wave is estimated using autocorrelation. The mathematical 
model used for estimating the fundamental frequency is given 
below [3]: 


A discrete short-time sequence is given by 

$$ s_n[m] = s[m]\, w[n - m] \qquad (1) $$

where $w[n]$ is an analysis window of duration $N_w$. 
The short-time autocorrelation function $r_n[\tau]$ is defined by 

$$ r_n[\tau] = \sum_{m=-\infty}^{\infty} s_n[m]\, s_n[m + \tau] \qquad (2) $$

When $s[m]$ is periodic with period $p$, $r_n[\tau]$ contains a 
peak at or near the pitch period $p$. For unvoiced sounds, no clear 
peak occurs near an expected pitch period. The location of the 
peak within the pitch period range provides both a pitch 
estimate and a voicing decision. The above correlation pitch 
estimator can be obtained, more formally, by minimizing, over 
possible pitch periods ($p > 0$), the error criterion 

$$ E[p] = \sum_{m=-\infty}^{\infty} \left( s_n[m] - s_n[m + p] \right)^2 \qquad (3) $$

Minimizing $E[p]$ with respect to $p$ yields 

$$ \hat{p} = \arg\max_{p} \left( \sum_{m=-\infty}^{\infty} s_n[m]\, s_n[m + p] \right) \qquad (4) $$

where $p > \varepsilon$, i.e., $p$ is sufficiently far from zero. 
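
The step from equation (3) to equation (4) can be made explicit. The following is a sketch of the standard expansion, assuming that the sums run over all m, so that shifting the sequence by p does not change its energy:

```latex
\begin{aligned}
E[p] &= \sum_{m=-\infty}^{\infty} \bigl( s_n[m] - s_n[m+p] \bigr)^2 \\
     &= \sum_{m=-\infty}^{\infty} s_n^2[m]
      + \sum_{m=-\infty}^{\infty} s_n^2[m+p]
      - 2 \sum_{m=-\infty}^{\infty} s_n[m]\, s_n[m+p] \\
     &= 2\, r_n[0] - 2\, r_n[p]
\end{aligned}
```

Since r_n[0] does not depend on p, minimizing E[p] is equivalent to maximizing the autocorrelation r_n[p], which is the estimator in equation (4).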

This alternative view of autocorrelation pitch 
estimation is used for detecting the pitch of Bodo vowels. The 
speech waveforms and corresponding pitch spectra of the six 
Bodo vowels are depicted in Figure (1) and Figure (2) for 
male and female informants, respectively. The estimated values 
of the fundamental frequency or pitch are given in Table (1) 
for the Bodo vowels. 
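
As an illustration of the estimator described above, the following minimal Python sketch picks the lag that maximizes the short-time autocorrelation of one windowed frame. It is not the authors' implementation: the synthetic 130 Hz test tone, the 8 kHz sample rate, the Hamming window, the 60 to 500 Hz search range and the function names are all assumptions made for this example.

```python
import math


def short_time_autocorrelation(frame, lag):
    """r[lag] = sum over m of frame[m] * frame[m + lag] for a finite frame."""
    return sum(frame[m] * frame[m + lag] for m in range(len(frame) - lag))


def estimate_pitch_hz(frame, sample_rate, f_min=60.0, f_max=500.0):
    """Return sample_rate / lag for the lag with the largest autocorrelation.

    The search is limited to lags that correspond to [f_min, f_max]; the
    lower lag bound keeps the candidate period away from zero.
    """
    min_lag = int(sample_rate / f_max)
    max_lag = int(sample_rate / f_min)
    best_lag = max(range(min_lag, max_lag + 1),
                   key=lambda lag: short_time_autocorrelation(frame, lag))
    return sample_rate / best_lag


# Assumed test signal: a 130 Hz tone plus one harmonic, 40 ms at 8 kHz.
sample_rate = 8000
n = int(0.040 * sample_rate)
tone = [math.sin(2 * math.pi * 130 * t / sample_rate)
        + 0.5 * math.sin(2 * math.pi * 260 * t / sample_rate)
        for t in range(n)]

# Apply a Hamming window as the analysis window w.
frame = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
         for i, x in enumerate(tone)]

pitch = estimate_pitch_hz(frame, sample_rate)
print(f"Estimated pitch: {pitch:.1f} Hz")
```

Because the lag is an integer number of samples, the estimate can only take values of the form sample_rate / lag, so at this sample rate it lands within a few hertz of the true 130 Hz rather than exactly on it.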


Table 1: Pitch of six Bodo vowels 

SPECIMEN   FUNDAMENTAL FREQUENCY (in Hz) 
           /a/       /e/       /i/       /o/       /u/       /ɯ/ 
MALE       130.001   242.049   148.200   154.004   126.003   146.030 
FEMALE     229.000   381.001   268.250   250.900   118.500   251.002 

Fig. (1): Estimation of pitch of Bodo vowel corresponding to 
Male informants (time domain) 

Fig. (3): Graphical representation of Fundamental Frequency of 
Bodo Male and Female for vowel utterances 



II. RESULTS AND DISCUSSION 

Typically, the pitch or fundamental frequency ranges 
from 80 Hz to 160 Hz for male speakers and from 140 Hz to 
400 Hz for female speakers [4]. The formant frequencies are 
usually greater than the pitch frequency. The estimation of 
pitch and formant frequencies finds extensive use in speech 
encoding, synthesis and recognition. In adults, the vocal folds 
are generally longer in males than in females. The longer the 
vocal folds, the lower the pitch frequency. Thus the pitch 
differs between male and female informants. 

In the present study, from Table (1) and Figure 
(3), it is observed that the values of pitch or fundamental 
frequency for female informants are higher than those for male 
informants for every vowel except /u/ (118.500 Hz against 
126.003 Hz), in line with Pinto et al. [5]. Thus, it can be 
concluded that the pitch or fundamental frequency can be 
effectively used for the verification of sex of Bodo 
speakers through their vowel utterances. 
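
The comparison can be checked directly against the values in Table (1). The sketch below copies those values and counts the vowels for which the female value is higher. The midpoint threshold rule that follows is a hypothetical illustration of how pitch might be used for such a decision; it is not a method described in this paper.

```python
# Fundamental frequencies (Hz) copied from Table (1), in the table's vowel order.
male = [130.001, 242.049, 148.200, 154.004, 126.003, 146.030]
female = [229.000, 381.001, 268.250, 250.900, 118.500, 251.002]


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


# Per-vowel comparison: True where the female value exceeds the male value.
female_higher = [f > m for f, m in zip(female, male)]

# Hypothetical decision rule (an assumption, not from the paper):
# place the threshold at the midpoint of the two group means.
threshold = (mean(male) + mean(female)) / 2.0


def guess_sex(f0_hz):
    """Label a pitch value using the assumed midpoint threshold."""
    return "female" if f0_hz > threshold else "male"


correct = (sum(guess_sex(f0) == "male" for f0 in male)
           + sum(guess_sex(f0) == "female" for f0 in female))
print(f"female higher for {sum(female_higher)} of {len(male)} vowels")
print(f"threshold {threshold:.1f} Hz labels {correct} of 12 values correctly")
```

With only one value per vowel and sex, this is a consistency check on the table rather than an evaluation of a classifier.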

Fig. (2): Estimation of pitch of Bodo vowel corresponding to 
Female informants (time domain) 